Model selection for Gaussian regression with random design
Author

Abstract
This paper deals with Gaussian regression with random design, where the observations are i.i.d. It is known from Le Cam (1973, 1975 and 1986) that the rate of convergence of optimal estimators is closely connected to the metric structure of the parameter space with respect to the Hellinger distance. In particular, this metric structure essentially determines the risk when the loss function is a power of the Hellinger distance. For random design regression, one typically uses as loss function the squared L2-distance between the estimator and the parameter. If the parameter space is bounded with respect to the L∞-norm, the two distances are equivalent. Without this assumption, there may be a large distortion between them, resulting in some unusual rates of convergence for the squared L2-risk, as noticed by Baraud (2002). We first explain this phenomenon and then show that using the Hellinger distance instead of the L2-distance recovers the usual rates and allows model selection to be performed in great generality. An extension to the L2-risk is given under a boundedness assumption similar to the one in Wegkamp (2003).
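To illustrate the distortion discussed above, here is a minimal numerical sketch (not from the paper; all function names and parameter values are my own choices). For Gaussian regression Y = f(X) + ε with ε ~ N(0, σ²), the Hellinger affinity between N(f(x), σ²) and N(g(x), σ²) is exp(−(f(x)−g(x))²/(8σ²)), so the squared Hellinger distance between the two models is 1 − ∫ exp(−(f−g)²/(8σ²)) dμ, where μ is the design law. The sketch estimates both distances by Monte Carlo with μ uniform on [0, 1]:

```python
import numpy as np

rng = np.random.default_rng(0)


def sq_hellinger(f, g, x, sigma):
    """Squared Hellinger distance between the models Y = f(X) + eps and
    Y = g(X) + eps, eps ~ N(0, sigma^2), estimated by Monte Carlo over a
    sample x drawn from the design law mu."""
    return 1.0 - np.mean(np.exp(-(f(x) - g(x)) ** 2 / (8 * sigma**2)))


def sq_l2(f, g, x):
    """Squared L2(mu)-distance between f and g, same design sample."""
    return np.mean((f(x) - g(x)) ** 2)


sigma = 1.0
x = rng.uniform(0.0, 1.0, size=200_000)  # design law mu = U[0, 1]

f = lambda t: np.sin(2 * np.pi * t)

# Small perturbation: h^2 ~ ||f - g||^2 / (8 sigma^2), the two losses agree.
g_small = lambda t: f(t) + 0.1
print(sq_hellinger(f, g_small, x, sigma), sq_l2(f, g_small, x) / (8 * sigma**2))

# Large perturbation (as with an L-infinity-unbounded parameter set): h^2
# saturates below 1 while the L2-distance blows up -- the distortion
# behind the unusual L2 rates.
g_big = lambda t: f(t) + 50.0
print(sq_hellinger(f, g_big, x, sigma), sq_l2(f, g_big, x) / (8 * sigma**2))
```

The first pair of numbers nearly coincide, while in the second pair the Hellinger term is capped near 1 and the rescaled L2 term is several hundred, which is exactly why an L∞-bound is needed for the two risks to be comparable.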
Similar resources
Slope Heuristics for Heteroscedastic Regression on a Random Design
In a recent paper [BM06], Birgé and Massart have introduced the notion of minimal penalty in the context of penalized least squares for Gaussian regression. They have shown that for several model selection problems, simply multiplying the minimal penalty by 2 leads to a (nearly) optimal penalty, in the sense that it approximately minimizes the resulting oracle inequality. Interestingly, the m...
Transformed Gaussian Markov Random Fields and Spatial Modeling
The Gaussian random field (GRF) and the Gaussian Markov random field (GMRF) have been widely used to accommodate spatial dependence under the generalized linear mixed model framework. These models have limitations rooted in the symmetry and thin tail of the Gaussian distribution. We introduce a new class of random fields, termed transformed GRF (TGRF), and a new class of Markov r...
Novel Radial Basis Function Neural Networks based on Probabilistic Evolutionary and Gaussian Mixture Model for Satellites Optimum Selection
In this study, two novel learning algorithms have been applied on Radial Basis Function Neural Network (RBFNN) to approximate the functions with high non-linear order. The Probabilistic Evolutionary (PE) and Gaussian Mixture Model (GMM) techniques are proposed to significantly minimize the error functions. The main idea is concerning the various strategies to optimize the procedure of Gradient ...
Negative Selection Based Data Classification with Flexible Boundaries
One of the most important artificial immune algorithms is negative selection algorithm, which is an anomaly detection and pattern recognition technique; however, recent research has shown the successful application of this algorithm in data classification. Most of the negative selection methods consider deterministic boundaries to distinguish between self and non-self-spaces. In this paper, two...
Fast Forward Selection to Speed Up Sparse Gaussian Process Regression
We present a method for the sparse greedy approximation of Bayesian Gaussian process regression, featuring a novel heuristic for very fast forward selection. Our method is essentially as fast as an equivalent one which selects the “support” patterns at random, yet it can outperform random selection on hard curve fitting tasks. More importantly, it leads to a sufficiently stable approximation of...
Journal title:
Volume/Issue:
Pages: -
Publication date: 2002